Abstract


Several clutter metrics were evaluated and compared against the probability of detection of
ground combat vehicle targets in test scenes created in a natural field environment. This paper
presents the methods used to test subjects and to compute the metrics. Finally, limited results
of the initial testing are compared against the metrics to indicate the effectiveness of these
metrics on this set of targets and, by extension, on targets in general.

Introduction 

One of the underlying goals of discovering metrics is to reliably compute information about
images, such as the probability of detection (Pd) of an object, and to effectively assist the
soldier in his assessment of his own Pd [1, 2]. If a computer could accurately predict Pd, it
would be able to expedite the process of concealing ground vehicles within enemy territory and
support tactical planning. It would also help cut development time for new camouflage treatments
and concealment methods. All of this is critical to the survivability of the soldier and ground
vehicles.

The metrics in this paper, called Target Structure Similarity Metrics (TSSM), are derived
from the popular Structural Similarity Metric of [3]. These metrics take into account certain
hypothesized characteristics of the human vision system (HVS), such as sensitivity to edges and
sensitivity to areas of high contrast [3]. TSSM use these qualities to measure image quality by
comparing a non-distorted reference image against a distorted image. The metric is then a
measurement of how closely specific qualities of the distorted image resemble those of the
reference image. The TSSM clutter metric is based on the signal processing features of human
vision aided by computer comparison of the images.

Test  Images 

The test images were taken at Eglin Air Force Base in Florida using a Panoscan camera with
resolutions reaching 52342 x 6000 pixels. These images exceeded the computing power of the
available resources, so they were modified: the images were compressed and cropped, and the
colors were rebalanced using a program called Photomatix Pro (http://www.hdrsoft.com/). They
were then loaded into a program developed specifically for testing human subjects on their
ability to locate and identify targets. The image set consisted of pictures of the same
background with military vehicles concealed in different locations in different pictures.


Fig.  1:  The  photosimulation  lab  consists  of  a  180  degree,  wrap-around  projection  screen 
illuminated  by  three  high  resolution  projectors. 

Survivability’s Visual Perception Lab (VPL) (Fig. 1) was used to test the metrics because it is
more economical than field testing [4]. The test was intended to represent military vehicles as
seen by the unaided eye from distances of 1-3 kilometers. Subjects were given a set of
instructions that told them there could be zero to four targets and that they would be given
four minutes to locate these targets regardless of their success. Each subject was given six
guesses. Because the images reuse the location, all efforts were taken to ensure that the
subject did not know his/her own success rate. The images were displayed on a projection screen
in the VPL, and subjects were given a joystick and a mouse. The joystick was used to pan the
image back and forth, while the mouse was used to point to the target. One of the buttons on the
joystick could be pressed to indicate that this was the target. The program kept statistics on
success rates and the time each subject took to make a detection decision. Fifty-three images
were shown to the viewer in random order, with a noise screen between successive images to
neutralize background retinal images. A typical background is shown in Fig. 2. Actual visual
stimuli were in color.


Fig. 2: A typical background of the test images. Real images have target vehicles in the scene.

TSSM


The TSSM divides an image into blocks and computes a similarity metric value for each block,
using the region of the image in which the target is located as the reference. The clutter is
then the overall average of the similarity values of all of these pieces of the background
compared against the target’s area in the image. To simplify the concept: TSSM clutter is how
often and how much the target resembles various regions of the scene.

The algorithm for TSSM was adapted for field image testing from [3] and [5]. TSSM is
implemented by taking the target $T$, where $T_i$ is the i’th pixel in the target block, and the
blocks $B_j$, where $B_j$ is the j’th block of the entire image and $B_{ij}$ is the i’th pixel of
that block. These blocks are twice the dimensions of the target to account for the HVS’s ability
to adapt its response to the size of the object being searched for following appropriate
training [5]. TSSM mathematically defines the following characteristics and combines them in a
single metric.

Luminosity or Average Intensity:

$$ l(T, B_j) = \frac{2 \mu_T \mu_{B_j}}{\mu_T^2 + \mu_{B_j}^2} \qquad (1) $$

where $\mu_x$ is the mean of $x$,

Contrast:

$$ c(T, B_j) = \frac{2 \sigma_T \sigma_{B_j}}{\sigma_T^2 + \sigma_{B_j}^2} \qquad (2) $$

where $\sigma_x$ is the standard deviation of $x$ and $\sigma_x^2$ is the variance of $x$,

and Structure:

$$ s(T, B_j) = \frac{\sigma_{T B_j}}{\sigma_T \sigma_{B_j}} \qquad (3) $$

where $\sigma_{T B_j}$ is the covariance of $T$ and $B_j$.
These three components are combined to yield an overall similarity measure. Since zero is a
possible value for the denominator, an arbitrary constant, C, is added. Here C = 0.02.

$$ \mathrm{tssim}(T, B_j) = \frac{4 \sigma_{T B_j} \mu_T \mu_{B_j}}{(\sigma_T^2 + \sigma_{B_j}^2)(\mu_T^2 + \mu_{B_j}^2) + C} \qquad (4) $$

These values are then combined again as either a root mean square, an arithmetic mean or a
geometric mean value.
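The similarity measure and its aggregation over blocks can be illustrated with a short Python sketch. It is not the MATLAB program used in this work. It assumes the product form of the intensity, contrast and structure terms with C = 0.02 added to the denominator, and it assumes each background block has already been resampled to the shape of the target, a step the text does not describe. The function names `tssim` and `tssm_clutter` are hypothetical, and taking magnitudes in the geometric mean is an assumption made because individual similarity values can be negative.

```python
import numpy as np

C = 0.02  # stabilising constant quoted in the text


def tssim(target, block, c=C):
    """Similarity between a target patch and one background block.

    Both inputs are 2-D grayscale arrays of identical shape (assumption:
    the block has been resampled to the target size beforehand).
    """
    t = np.asarray(target, dtype=float).ravel()
    b = np.asarray(block, dtype=float).ravel()
    if t.shape != b.shape:
        raise ValueError("target and block must have the same shape")
    mu_t, mu_b = t.mean(), b.mean()
    var_t, var_b = t.var(ddof=1), b.var(ddof=1)
    # Sample covariance between corresponding pixels.
    cov_tb = np.sum((t - mu_t) * (b - mu_b)) / (t.size - 1)
    numerator = 4.0 * cov_tb * mu_t * mu_b
    denominator = (var_t + var_b) * (mu_t ** 2 + mu_b ** 2) + c
    return float(numerator / denominator)


def tssm_clutter(values, mode="rms"):
    """Combine per-block similarity values into a single number."""
    v = np.asarray(values, dtype=float)
    if mode == "rms":
        return float(np.sqrt(np.mean(v ** 2)))
    if mode == "arithmetic":
        return float(np.mean(v))
    if mode == "geometric":
        # Assumption: use magnitudes, since similarities may be negative.
        if np.any(v == 0):
            return 0.0
        return float(np.exp(np.mean(np.log(np.abs(v)))))
    raise ValueError("mode must be 'rms', 'arithmetic' or 'geometric'")
```

A block identical to the target gives a value just below 1, and a block whose intensities are inverted about the same mean gives a value just above -1, which matches the intuition that the measure expresses how closely a region of the scene resembles the target.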


Schmieder-Weathersby 

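One of the metrics used is the Schmieder-Weathersby metric [2], which is defined here as:

$$ \mathrm{clutter} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \sigma_i^2} \qquad (6) $$

where $\sigma_i^2$ is the intensity variance of the i’th cell and N is the number of cells across
the image, where a cell is twice the maximum target dimension.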
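As an illustration of Eq. (6), the following Python sketch computes the root mean of the per-cell intensity variances. It is a minimal sketch, not the code used in this work. The function name `sw_clutter` and its parameters are hypothetical, the cell size is passed in by the caller (twice the maximum target dimension in the text), and dropping partial cells at the right and bottom edges is an assumption.

```python
import numpy as np


def sw_clutter(image, cell_h, cell_w):
    """Root mean of the intensity variances of non-overlapping cells."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    variances = []
    # Step through the image one full cell at a time; partial cells
    # at the edges are ignored (assumption).
    for r in range(0, rows - cell_h + 1, cell_h):
        for c in range(0, cols - cell_w + 1, cell_w):
            cell = img[r:r + cell_h, c:c + cell_w]
            variances.append(cell.var())
    if not variances:
        raise ValueError("image is smaller than one cell")
    return float(np.sqrt(np.mean(variances)))
```

A perfectly uniform image yields a clutter value of zero, and the value grows as the intensity variation inside the cells increases.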


Entropy 

Additionally, the entropy metric on image quality was tested to assess the effect of information
entropy on Pd. The poorer the picture quality, the harder it should be to find a target in the
picture. This is important because the images used are processed and do not necessarily reflect
the true visibility of what test subjects saw on-site. The entropy metric is defined here from [6]:

$$ H(f) = -\sum_{t=1} P(f_t) \log P(f_t) \qquad (7) $$

where $P(f_t)$ is the probability that the grey scale value (luminance state) of the t’th pixel
appears in the image, f.
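The entropy of Eq. (7) can be sketched in Python from the grey-level histogram of an image. This is a sketch under stated assumptions rather than the implementation used here: the sum is taken over the distinct grey levels that occur in the image, the logarithm is taken to base 2 (the base is not stated in the text), and pixel values are assumed to be non-negative integers below `levels`. The function name `image_entropy` is hypothetical.

```python
import numpy as np


def image_entropy(gray, levels=256):
    """Shannon entropy, in bits, of the grey-level distribution."""
    g = np.asarray(gray)
    # Count how often each grey level occurs in the image.
    counts = np.bincount(g.ravel().astype(np.int64), minlength=levels)
    # Keep only levels that occur, so log2 is never applied to zero.
    p = counts[counts > 0] / g.size
    return float(-np.sum(p * np.log2(p)))
```

A single-valued image has zero entropy, while an image whose pixels are spread evenly over four grey levels has an entropy of two bits.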
Metrics


The test images were cropped down to a 200 x 300 pixel bitmap and processed by a MATLAB
program developed for implementing the metrics. The target region was chosen subjectively
(best approximation of a rectangle around the target), and then the background was divided up in
a number of ways for comparison. These methods included using target-sized regions of space
immediately surrounding the target, dividing the entire bitmap into target-sized regions, and
dividing it into regions of twice that size. Comparing the target with the background in
intensity, contrast and structure, the program generated a value known as the TSSM average,
which is then compared against the actual Pd values to find a correlation. It should be noted
that while the subjects were shown stimuli in color, the program discards all color information
from input images at this time. Future modifications of the code will permit color processing in
metric evaluation by the combination of several image color planes.
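The comparison of a metric against measured Pd values can be illustrated with a correlation coefficient. The text does not say which coefficient is used, so the Pearson coefficient in the Python sketch below is an assumption, and the function name `pearson_r` is hypothetical.

```python
import numpy as np


def pearson_r(metric_values, pd_values):
    """Pearson correlation between per-image metric values and Pd."""
    x = np.asarray(metric_values, dtype=float)
    y = np.asarray(pd_values, dtype=float)
    if x.shape != y.shape or x.size < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    # Centre both series, then normalise their inner product.
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant series")
    return float(np.sum(xc * yc) / denom)
```

In use, one sequence would hold the TSSM average of each test image and the other the fraction of subjects who detected the target in that image. A value near +1 or -1 would indicate a strong linear relationship between the metric and Pd.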

Results 

As of the time that this paper was written, the results indicate that there is good correlation
between the performance of test subjects in the field and that of test subjects in the visual
perception lab. This is an important correlation because it validates the use of data gathered
at the lab, which is much more easily and cost-effectively obtained than data gathered on-site in
the field. Further analysis is required to determine the correlation of the metrics with the
performance of the test subjects at the laboratory.

Conclusion 

Further analysis must be completed before a more conclusive statement may be made about
the degree of correlation between the metrics mentioned in this report and the experimental
data. In the future, more studies will be conducted to find other metrics and more efficient
implementations. Finally, more research will be completed on methods of testing subjects in a
visual perception lab as well as on processing images for use in image metric programs.